[57] ABSTRACT

A Shared Data Access Serialization mechanism for sharing data among a
plurality of systems while maintaining data integrity. User data is
maintained on a primary and optionally an alternate data store. Each data
store contains a set of lock blocks, one for each system sharing the
data. The contents of the lock blocks, normally a time-of-day value,
indicate system ownership status of the associated data. "Lock Rules" are
disclosed for determining resource ownership, as well as a "lock
stealing" mechanism for obtaining resource ownership from a temporarily
stopped system. Suffix records and check records are used to insure data
integrity. Error indications deduced from inconsistent suffix and/or
check records are used to trigger a data recovery mechanism, and the
recovery mechanism can synchronize a primary and secondary data store
without the necessity of suspending access to the primary during the
synchronization process.
SHARED ACCESS SERIALIZATION FEATURING SECOND PROCESS LOCK STEAL AND
SUBSEQUENT WRITE ACCESS DENIAL TO FIRST PROCESS

This is a continuation of copending application(s) Ser. No. 07/548,516
filed on Jul. 2, 1990.

BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention relates to operating systems for computers and computer
complexes. More particularly, this invention describes a mechanism and
process for sharing data among different users of the computer or
computer complex.

2. Background Art

Data sharing in a multiprocessor complex, or sysplex, is accomplished in
a variety of ways depending on the location of the data to be shared
(e.g., main storage, or DASD), the nature of the access permitted (e.g.,
read-only; read-write; etc.) and the granularity of sharing (e.g., device
level sharing, data set level sharing, or record level sharing). Common
to virtually all sharing mechanisms is a locking mechanism, to insure
that data being modified by one user is not referenced by another user
until the modification is complete. An example of a locking mechanism
used in the IBM MVS system is the [illegible] Development Macro Reference
(GC28-1857-1), RESERVE permits the reservation of a device for use by a
particular task on a particular system. Using this mechanism, all data
sets on a RESERVEd device will be unavailable to other tasks on other
systems until the owning task releases the device. While insuring data
integrity, the granularity of this mechanism is clearly not fine. Another
mechanism, ENQ/DEQ (described in the same publication), allows a user to
define and similarly control a serially reusable resource (i.e., a
resource that can be shared among users, though only one can access it at
a time). Still another scheme is described in IBM Technical Disclosure
Bulletin, Vol. 22, No. 6, Nov. 1979, at pp. 2571-2573: this mechanism
provides for record-level sharing of a data-set across different systems
by means of the storing of a user-unique key, along with a time-of-day
indicator, in an access record, to serve as a lock indicator to
subsequent accessors.

Common to these and most similar schemes is a deficiency in that a
resource locked by a user can become lost to other users if the system of
the locking user becomes disabled for an extended period of time.
Further, they provide no facilities or assistance in the event of damage
to the data, nor do they deal with the situation where a backup data set
is to be maintained for availability purposes.

SUMMARY OF THE INVENTION

In accordance with this invention, shared user data is stored on a
primary data store. The primary data store contains control information
in the form of lock blocks (one associated with each data-sharing
system), suffix records, and check records. The suffix and check records
are used to insure the integrity of the user data, and the lock blocks
are used to allow a single sharing system to "lock" the data when
necessary.

The invention further provides for an alternate data store, also
containing control information, and initially identical to (duplexed
with) the primary data store. When present, the alternate data store
provides a greater level of data availability by allowing re-creation of
damaged data. The alternate data store also is used in "lock stealing",
wherein a second sharing system, requiring a resource locked by a first
system, can "steal" the lock formerly held by the first system, indicate
that the lock is "stolen" by manipulation of the control information in
the primary and alternate data store, and so [illegible] processing in a
nondisruptive manner.

It is an object of this invention to permit the efficient sharing of data
among different users of a system or complex of systems, or sysplex.

Another object of this invention is to make shared data, locked by a
stopped system, available to other systems in a sysplex.

Another object of this invention is to allow a temporarily stopped system
to resume operation in a nondisruptive fashion, though other systems have
accessed data that system had locked prior to its stoppage.

Another object of this invention is to recover damaged data without the
necessity of switching to an alternate, or backup, data store.

A further object of this invention is to synchronize a new alternate data
store without the necessity of temporarily suspending access to the
primary data store during the synchronization process.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram indicating the relationship between the Shared
Data Access Serialization user program, the user data on DASD, and the
instances of the Shared Data Access Serialization on each system.
Although only two systems are depicted, more than two systems can
participate in this relationship.

FIG. 2 is a block diagram depicting the services available to the Shared
Data Access Serialization user.

FIG. 3 is a block diagram depicting the content of the shared data store
used by Shared Data Access Serialization in managing user data.

FIG. 4 is a block diagram showing the control structures relating to each
user data item.

FIG. 5 is a block diagram showing the structure used to control access to
a resource.

FIG. 6 is a block diagram showing the structure used to ensure data
integrity of the resource. Both suffix and check records have the same
structure.

FIG. 7 is a flow diagram illustrating control flow for READ request
processing.

FIGS. 8A and 8B are flow diagrams illustrating control flow for READ DATA
processing.

FIGS. 9A and 9B are flow diagrams illustrating control flow for permanent
error processing.

FIGS. 10A, 10B and 10C are flow diagrams illustrating control flow for
READ SERIALIZED request processing.

FIG. 11 is a flow diagram illustrating control flow for read lock blocks
processing.

FIG. 12 is a flow diagram illustrating control flow for fix lock blocks
processing.

FIG. 13 is a flow diagram illustrating control flow for write lock blocks
processing.

FIGS. 14A and 14B are flow diagrams illustrating control flow for lock
owner signal processing.

FIGS. 15A, 15B and 15C are flow diagrams illustrating control flow for
WRITE SERIALIZED request processing.

FIG. 16 is a flow diagram illustrating control flow for write data
processing.

FIG. 17 is a flow diagram illustrating control flow for UNLOCK request
processing.

FIGS. 18A and 18B are flow diagrams illustrating control flow for lock
steal processing.

FIGS. 19A and 19B are tables showing control flow for a read serialized
example from multiple systems for the same resource.

FIG. 20 is a table showing the lock block state changes for the example
of FIG. 19.

FIGS. 21A, 21B, 21C, 21D and 21E are tables showing control flow for a
first lock steal processing example.

FIG. 22 is a table showing the lock block state changes for the example
of FIG. 21.

FIGS. 23A, 23B, 23C, 23D, 23E and 23F are tables showing control flow for
a second lock steal processing example.

FIG. 24 is a table showing the lock block state changes for the example
of FIG. 23.

FIG. 25 is a block diagram showing the control block structure for the
primary store descriptor, the alternate store descriptor, and lock
request elements.

FIGS. 26A, 26B, 26C, 26D, and 26E are flow diagrams showing control flow
for "New Alternate" processing.

FIG. 27 is a flow diagram showing control flow for detecting that lock
steal processing is required for a lock request element.

DESCRIPTION OF THE PREFERRED 
EMBODIMENT 

FIG. 1 shows a high level view of Shared Data Access Serialization.
Shared Data Access Serialization provides for serialization of updates to
the user data maintained on shared DASD. This serialization is performed
at a low level of granularity, a single user record or resource, while
insuring fairness of access, and tolerating failures.

The data store used to maintain the user resources is typically contained
on a DASD device. Shared Data Access Serialization maintains global
control information in the data store (FIG. 3, item 302). A primary data
store (FIG. 1 at 103) contains user data, which may optionally be
duplexed in an alternate data store (FIG. 1 at 104). Each data store
contains its own control information. Therefore, both the primary and the
alternate data store will contain control information used to access the
data in that particular data store (see the discussion on data duplexing
for more information on primary and alternate data stores). This
information includes a map of logical resource names to physical location
and data attributes, and is used to map the name of the data given on a
user request to the physical data behind the resource of interest. This
conventional mapping is not relevant to the actual serialization
protocols and will not be discussed further.

Each resource contained in a data store has several types of resource
level control information associated with it. This control information is
used to control access to the resource, and is used to control recovery
and reliability of the user data. (See FIG. 3 at 303.)

The access control information comprises a set of lock blocks. There is
one lock block for each system that may be involved in sharing the
resource. FIG. 4 shows the relationship of the lock blocks, 401, to the
resource's user data, 402. This figure also shows that there is one lock
block for each system which may share the resource. In this example there
are "n" systems which may share the resource. Each system is assigned
ownership of a lock block and only one system can own any given lock
block. Ownership of a lock block is assigned at system initialization
time using information contained in the global control information of the
data store. A system assigned lock block three owns lock block three.
Ownership of a lock block position and ownership of a specific resource
are not the same. Ownership of a lock block position does not imply
ownership of a resource.

A lock block is a count/key/data record. The key portion of the record is
the critical part in the serialization process; the data part of the
record is not important and not needed in the serialization process. As
shown in FIG. 5, the key, 501, comprises a system sequence number,
uniquely assigned at system initialization using information contained in
the global control information, and a time-of-day (TOD) value, used to
order requests to access the resource. The normal state of a lock block
is unlocked or all zeros. The locked state, which shows intent to update
the resource, is reflected by a nonzero key field in at least one lock
block.
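
The key layout described above can be illustrated with a minimal Python sketch. The names `LockKey`, `is_unlocked` and `resource_is_locked` are hypothetical and chosen only for illustration; the on-disk count/key/data format and field widths are not modeled.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class LockKey:
    """Key portion of a lock block (hypothetical in-memory model)."""
    system_seq: int = 0  # system sequence number, assigned at initialization
    tod: int = 0         # time-of-day value used to order requests

    def is_unlocked(self) -> bool:
        # The normal (unlocked) state of a lock block is an all-zero key.
        return self.system_seq == 0 and self.tod == 0


def resource_is_locked(lock_blocks: list[LockKey]) -> bool:
    # The locked state is reflected by a nonzero key in at least one lock block.
    return any(not key.is_unlocked() for key in lock_blocks)
```

For example, a resource shared by three systems with keys `[LockKey(), LockKey(2, 1000), LockKey()]` would be reported as locked, because the second system has recorded a nonzero key.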

In addition to access control information, the re- 
source contains control information which is used to 
ensure data integrity. Suffix records are used to ensure 
that the user data contained in the associated physical 
data record is complete and consistent. As shown in 
FIG. 4 at 404A and 404B, the suffix record is physically 
part of the user data record, although not visible to the 
user of the data. The single record write time-of-day 
(TOD) value maintained in the suffix record, FIG. 6 at 
604, is used to determine the completeness or consistency of a single
physical record. The check record, shown in FIG. 4 at 403, in addition to
the suffix record, is used to ensure data integrity for a multi-record
write operation. The suffix record alone is not sufficient to ensure data
integrity when more than one physical block is written for a request. All
physical records actually written may have been written successfully, but
the resource as a whole will be inconsistent if all the intended physical
records comprising the multi-record write operation have not been
written. The multi-record write time-of-day (TOD) values maintained in
the check record and each suffix record are used to ensure that all the
physical blocks comprising a resource are logically consistent.
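
As a rough illustration of the two levels of checking, the following Python sketch models suffix and check records as simple objects. The class and function names are hypothetical, and the test for a complete physical record (a nonzero single-record write TOD) is an assumption made for this sketch rather than the exact comparison performed by the mechanism.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SuffixRecord:
    single_write_tod: int  # assumed: zero means the record was not fully written
    multi_write_tod: int   # TOD of the multi-record write this record belongs to


@dataclass
class CheckRecord:
    multi_write_tod: int   # TOD of the latest multi-record write to the resource


def record_complete(suffix: SuffixRecord) -> bool:
    # Single-record check: the suffix is written last, so a zero TOD
    # suggests the transfer ended before the suffix reached the data store.
    return suffix.single_write_tod != 0


def resource_consistent(suffixes: List[SuffixRecord], check: CheckRecord) -> bool:
    # Multi-record check: every physical record must be complete and must
    # carry the same multi-record write TOD as the check record.
    if not all(record_complete(s) for s in suffixes):
        return False
    return all(s.multi_write_tod == check.multi_write_tod for s in suffixes)
```

A resource whose physical records are each complete but carry different multi-record write TOD values fails the second test, which corresponds to the case of a multi-record write that was started but not finished.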

Sequence numbers are maintained in both the suffix 
and the check records. There is one sequence number 
field for each system that shares the data store. The 
sequence number is used by a system to determine if the 
data that it has written to the primary data store has 
been propagated to the alternate data store during certain
error conditions. During recovery from these error 
conditions, the use of these sequence numbers allows 
Shared Data Access Serialization to report to its user 
that either a user change has been successfully written 
to the data store or that the change was not made and 
the user must restart the update sequence of read seriali- 
zed/change data/write serialized. 

The data store used to maintain Shared Data Access 
Serialization data must have certain properties. Once 
the data store device (a DASD device) starts a request 
from one system it must either complete the request 
from that system or terminate the request with an error 
prior to initiating a request from another system. That 
is, requests must not be interlaced by the data store 
device. The operation of the request may be halted at any point but the
interruption must be reported as an exception to the Shared Data Access
Serialization. Data written to the data store must be processed in the
order presented to the data store. That is, the data presented to the
data store may not be transferred from the processor to the record on the
data store in random or interleaved order. This is because the suffix
record, physically the last part of the data store record, is used for
consistency checking. In order for the consistency checking to work
correctly, the suffix record must be written after all user data has been
written. If the data transfer to a data store must be terminated prior to
completion of the entire data transfer, the data that was not transferred
must be set to zeros on the data store. It is this inconsistent data in
the check record (a single record write time of day cannot be zero,
FIG. 6 at 604) that allows for data consistency checking. The data store
device must allow for conditional requests. That is, the request must be
able to test the value of a lock (the value of a record's key) and
conditionally execute further request operations (such as writing data to
some other record).

Data duplexing allows Shared Data Access Serialization to provide for
error correction and recovery. Data duplexing is an option of Shared Data
Access Serialization and is not required to provide for serialized access
to data. If one elects to not use data duplexing, many of the error
recovery features of Shared Data Access Serialization are lost, resulting
in an exposure to system availability.

Shared Data Access Serialization uses two data stores to achieve
duplexing, a primary (FIG. 1 at 103) and an alternate (104). The primary
and alternate data stores are synchronized during initialization of the
first system to use the data stores. Synchronization ensures that all the
data on the primary data store is copied to the alternate data store. The
alternate data store is then capable of being used as the primary data
store should an uncorrectable error occur on the primary. During normal
processing, data written to the primary data store is also written to the
alternate data store. This maintains the alternate data store as a
reliable replacement and backup for the primary.

An alternate data store may be made available to Shared Data Access
Serialization either when Shared Data Access Serialization is first
initialized or at any time thereafter. Making an alternate data store
available is not disruptive to the normal processing of user requests to
serialize a resource. That is, user requests are processed without
excessive delay while Shared Data Access Serialization is synchronizing
the alternate data store with the primary data store.

The alternate data store is used to provide for both error correction and
recovery of any error encountered on the primary. This recovery is
accomplished at either a local or a global level.

Global error recovery is necessary when an uncorrectable error occurs on
the primary data store. This includes errors that prevent the entire data
store from being accessed, such as loss of all paths to the data store,
or when local recovery of a record or set of records fails, which can
occur as the result of a defective storage media (bad data record).
Global recovery requires all the systems sharing the damaged primary data
store to stop using it. These systems will then discontinue use of the
primary to satisfy data requests and begin use of the fully functional
alternate data store. Provisions are also made for activating another
data store as an alternate and synchronizing this new alternate with the
primary.

Local error recovery is attempted when it appears that an error is
isolated to a specific record or set of records on the primary data
store. For example, a record has been successfully read (that is, data
transfer from the data store to the processor was successful) but the
suffix record indicates that the user data is not logically consistent.
This could occur if another system sharing the data store failed during
data transfer to the data store and did not transfer a complete
consistent record. The suffix record in this case would not contain the
expected values, thus indicating that the data is possibly invalid.
Another case of local failure is when a multi-record write is started but
not completed by a system (a multi-record write is required when user
data spans more than one physical record). In this case, each physical
record that comprises the logical record is complete. The suffix record
for any of these physical records would indicate that the physical record
is valid. However, the set of physical records is not consistent. This is
detected by the fact that some of the multi-write time-of-day values
contained in the suffix records will not match the multi-write
time-of-day value contained in the check record. (See FIG. 4 at 403 and
FIG. 6 at 603.)

Note that the above local problems would normally be detected by the
system which experienced the problem causing the bad data store
information and that system would attempt recovery. The suffix records
and check records allow any system to detect the error and attempt the
recovery. This is an important attribute since the system where the error
originated might suffer problems that prevent it from correcting the
error and possibly from reporting the error to the other systems sharing
the data store. Were it not for the suffix and check record processing,
the user data on the data store would not be reliable.

Shared Data Access Serialization provides the user several services which
allow access to a resource: READ, READ SERIALIZED, WRITE SERIALIZED and
UNLOCK.

READ is an unserialized access of a named resource. It does not prevent
access of the same resource from other systems. The READ service does not
alter the lock state of the resource nor does it look at the lock state
of the resource. Therefore, READ can be issued for a resource that is
currently locked by another system and complete prior to the unlock of
the resource by the other system.

READ SERIALIZED is a controlled access to a named resource and implies an
intent to update the resource. Since the resource may be changed,
serialization must be obtained prior to accessing the data. The
resource's lock, FIG. 4 at 401, and FIG. 5, must first be obtained. The
resource may not be read until the lock has been obtained. That is, this
system must own the lock, and therefore the resource, prior to reading
any data associated with the resource.

WRITE SERIALIZED is a controlled update of a named resource. The resource
must have been accessed by a preceding READ SERIALIZED request. The
resource is updated only if the lock obtained by the preceding READ
SERIALIZED request is still held by this system. Once the data is
successfully written, this system's lock is released; the resource is
unlocked for this system. The next system waiting for the lock, if there
is another system waiting, is informed that it is
now the owner of the resource. If the lock is no longer held, the WRITE
SERIALIZED request is failed and the requestor is notified of this
action. The resource is not updated when the lock is no longer held.

UNLOCK is a way to release the lock obtained for a named resource by this
system via a READ SERIALIZED request without altering the contents of the
user data. The user data portion of the resource maintained by Shared
Data Access Serialization is not modified. The resource is released or
unlocked for this system. The next system waiting for the lock, if there
is another system waiting, is informed that it is now the owner of the
resource.

The ability of a system to lock user data is a very important one in
terms of being able to share the data between two or more systems. Shared
Data Access Serialization provides this ability at a low level of
granularity. However, in order to have continuous availability of the
data with no operator involvement, some means must be provided to protect
the user data from becoming inaccessible. Inaccessibility of data can
happen if the data was locked by a system which is either no longer
functioning and therefore unable to release the lock or when a system is
delayed for an extensive period of time, possibly by being stopped.

Shared Data Access Serialization provides, transparent to either the user
of the data or the system operator, a mechanism whereby a system waiting
for a resource for too long can safely take ownership of that resource
away from another system. This is called stealing the lock. Stealing the
lock must be accomplished in such a way as to not jeopardize the
integrity of the data associated with the lock. The system from which the
lock is stolen must not be able to continue execution and, thinking it
owns the lock, write data to the resource. That is, stealing the lock
must work just as well against a temporarily stopped system as one that
is no longer running. In addition, the temporarily stopped system must be
able to recover from having the lock stolen from it.

Lock steal uses the access control information associated with the
resource to ensure data integrity of a resource while still allowing the
resource to be assigned to a new owner. See FIG. 4, item 401A, 401B, etc.
and FIG. 5. The new owner is also selected by using the access control
information.

The ability to steal a lock is dependent on the usage of the access
control information by the READ SERIALIZED and WRITE SERIALIZED
operation. (See the READ SERIALIZED and WRITE SERIALIZED sections of this
document for more information on their particular processing.) The READ
SERIALIZED operation records information in a lock block associated with
the resource and gains ownership of the resource prior to reading in the
resource data. The data is read in from the data store and is presented
to the user. The user updates the data, then invokes the WRITE SERIALIZED
operation requesting that the data be written. WRITE SERIALIZED then
writes the data to the data store ensuring that the lock contains the
same value as set by the corresponding READ SERIALIZED operation. The
update attempt is failed if the lock does not contain the correct
information. The lock will not contain the correct information when a
lock steal has occurred since lock steal alters the contents of the lock.

REQUEST PROCESSING

READ Operation

The READ operation (explained more fully in FIG. 7 and the accompanying
text below) allows the user to access a resource stored on the data
store. The user has indicated by using the READ operation that there is
no intent to update the information read and therefore no reason to
prevent access of the data by any other system.

Shared Data Access Serialization conventionally uses the resource name or
identifier from the read request and the global control information (302)
from the data store to locate the specific resource in the data store and
to determine the resource's attributes (such as record format and size).

The data store is referenced at various times during READ processing.
While accessing the data store, READ may experience problems which
require that error recovery be initiated on the data store. A READ
operation may encounter problems accessing the lock block information on
the data store or referencing the resource data. See the discussion at
the end of the READ SERIALIZED section for information on resource data
recovery.

READ SERIALIZED Operation

The READ SERIALIZED operation (outlined in this section and explained
more fully in FIG. 10 and the accompanying text below) and the WRITE
SERIALIZED service (outlined in the next section) allow the user to
update a resource in a controlled serialized manner. In order for the
user to update a shared resource on the data store, the user must first
lock the resource and read the resource in from the data store. This is
accomplished by requesting Shared Data Access Serialization to perform a
READ SERIALIZED request for the resource. When Shared Data Access
Serialization has completed the READ SERIALIZED request, the resource has
been locked, preventing updates of this resource by other users of the
data store, and read in from the data store. The resource is available
for processing by the user. When the reader finishes processing the
resource, any changes to the resource can be committed to the data store
by requesting via a WRITE SERIALIZED request that Shared Data Access
Serialization update the resource. See the next section for more details
on the WRITE SERIALIZED service.

After processing the resource, the user may decide that no changes are to
be made to the resource on the data store. The user may invoke the Shared
Data Access Serialization UNLOCK service to release the serialization on
the resource without committing any changes. [illegible] details on the
UNLOCK service.

Shared Data Access Serialization conventionally uses the resource name or
identifier from the READ SERIALIZED request and the global control
information (302) from the data store to locate the specific resource in
the data store and to determine the resource's attributes (such as record
format and size).

Since this is a serialized read, the resource lock, FIG. 4 at 401A, 401B,
etc., must be obtained prior to accessing any data. Obtaining the lock is
a multiple step operation. The first step in obtaining a resource is to
generate this system's lock key and to record this lock key into
the appropriate lock block in the resource's access control information.
The appropriate lock block to be used by this system was determined at
system initialization time. Each system owns one of the lock blocks
associated with a resource (401A, 401B, etc.). Ownership is determined
from information contained in the global control information of the data
store. The same lock block in the set of lock blocks for a resource (401C
for example) is owned by a system for all resources.

The lock key comprises a system sequence number and a time-of-day value.
(See FIG. 5 at 501.) The system sequence number is obtained at system
initialization time and is unique for each system sharing the data store.
The global control information contained in the data store is incremented
by initializing systems to generate the next unique system sequence
number. A system will use the same system sequence number throughout its
processing. (See FIG. 3 at 302.) The time-of-day value is a time obtained
from the processor. A new time value is obtained each time a lock key is
to be obtained for a user by Shared Data Access Serialization.

Once the key has been generated, it is written to the 
appropriate lock block in the data store. This is the first 
step in obtaining a lock and records in the data store the 
intent of this system to use the resource associated with 
this set of lock blocks. 

Thus far, all Shared Data Access Serialization has done is indicate its
intent to use the resource. This system must now determine who is the
owner of the resource. This is called lock rule processing. There are
three lock rules that are used to determine ownership. All the lock
blocks associated with the resource are read from the data store. By the
first lock rule, a system is the resource owner if all lock blocks other
than that system's own are zero. No other system is interested in this
resource.

By the second lock rule, a system is definitely not the owner if at least
one lock block other than the system's lock block is not zero (at least
one other system is interested in the resource) and all nonzero lock
blocks have time-of-day values that are older than the system's
time-of-day value. This system must wait for use of the resource. Lock
ownership will be passed to this system by another system when the other
system resets its lock block lock key to zero and determines which system
is the next system in order to use the resource. That is, this system
will wait until all systems with older time-of-day values have processed,
each in turn.

The third lock rule deals with the case where at least one other system's
time-of-day value is younger than a first system's time-of-day value.
This condition can arise because the act of obtaining a time stamp and
recording interest in a resource are not atomic. That is, two requests
may be interleaved. System A gets a time stamp followed by System B
getting a time stamp and recording that time stamp in the data store.
Subsequently, System A records its time stamp and finds the time stamp of
System B which is younger.

The ownership state is indeterminate in this case. To resolve this, the
first system will generate a new time-of-day value, produce an updated
lock key using this new time-of-day value and write its lock block with
this updated lock key. When the lock block update is complete, the first
system will read the lock blocks for all systems and find the system with
the oldest time-of-day value. The system with the oldest time-of-day
value is informed that it is the owner. Informing a system that it is the
owner when it already knows it is the owner is acceptable. The first
system then waits for notification that it has become the owner.

Once a system has determined (lock rule one) that it 
is the owner or has been informed by another system 
that it is the owner (lock rule two or lock rule three), it
can read the user data from the data store to complete
the user request. The user is informed that the data is
available. 
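
The three lock rules can be sketched as a small classification
function. This is only an illustration under stated assumptions:
each lock block is modeled as a plain integer time-of-day value
(0 meaning "not interested", a smaller value meaning an older
request), and the names apply_lock_rules, oldest_requester and
Ownership are hypothetical, not taken from the disclosure. The
real lock key also carries a system sequence number, which is
ignored here.

```python
from enum import Enum


class Ownership(Enum):
    OWNER = "lock rule 1"
    MUST_WAIT = "lock rule 2"
    INDETERMINATE = "lock rule 3"


def apply_lock_rules(lock_blocks, me):
    """Classify ownership for system `me` from all lock blocks of a resource.

    lock_blocks[i] is system i's time-of-day value; 0 means "not interested".
    A smaller value is an older request (assumed integer model).
    """
    my_tod = lock_blocks[me]
    others = [tod for i, tod in enumerate(lock_blocks) if i != me and tod != 0]
    if not others:
        # Rule 1: every other lock block is zero.
        return Ownership.OWNER
    if all(tod < my_tod for tod in others):
        # Rule 2: every other interested system asked earlier; wait for a signal.
        return Ownership.MUST_WAIT
    # Rule 3: some other request is younger, so ownership is indeterminate.
    return Ownership.INDETERMINATE


def oldest_requester(lock_blocks):
    """Index of the system holding the oldest nonzero time-of-day value."""
    interested = [(tod, i) for i, tod in enumerate(lock_blocks) if tod != 0]
    return min(interested)[1] if interested else None
```

Under rule 3 the requesting system would rewrite its own lock
block with a fresh time-of-day value, reread all lock blocks,
and signal the system returned by oldest_requester. For
example, with lock blocks [130, 0, 120] (system 0 having just
refreshed its value to 130), system 2 is the one informed.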

As noted, the data store is referenced at various times 
during READ SERIALIZED processing. While ac-
cessing the data store, READ SERIALIZED may
experience problems which require that error recovery 
be initiated on the data store. The error recovery to be 
taken is dependent on which part of the data store is 
damaged. A READ SERIALIZED operation may 
encounter problems accessing the lock block informa- 
tion on the data store or referencing the resource data. 

If the lock blocks are damaged (determined by an
attempt to write or read the lock blocks resulting in a 
permanent read or write error), an attempt is made to 
reformat the lock blocks. Reformatting physically rec- 
reates the track containing the lock block information 
for this resource on the data store. Formatting writes 
are used instead of normal write operations. Since lock 
blocks control access to the resource, repair of the lock 
blocks must be done in such a way as to temporarily 
stop access to the specific resource whose lock blocks 
are damaged. The current owner of the resource will
lose ownership since all control information will be lost 
as a result of the repair. Waiters will lose their spot in 
line. The owner and all waiters will have to re-attempt 
to gain ownership. In effect, the repair of lock blocks 
becomes a mass steal of the lock from all systems cur- 
rently interested in this resource. Since lock block re- 
pair is effectively a steal operation, lock blocks are 
reformatted on both the alternate and the primary data 
stores, regardless of which data store the damage was 
detected on. This is necessary to safely effect the stop- 
ping of the usage of the resource. (See FIG. 18 for more 
detail on steal processing and FIGS. 11, 12 and 13 for 
more information on lock block processing and repair.)
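
A minimal sketch of the repair, assuming each data store is
modeled as a dictionary mapping a resource name to its list of
lock blocks. The function name fix_lock_blocks is hypothetical.
The alternate-before-primary order follows the description of
fix lock block processing (FIG. 12) given later in the text;
failure handling for the reformat itself is left out.

```python
def fix_lock_blocks(primary, alternate, resource, n_systems):
    """Zero every system's lock block for `resource`, alternate store first.

    Each store is a dict: resource name -> list of lock blocks (assumed model).
    Returns the order in which the stores were reformatted.
    """
    order = []
    if alternate is not None:
        # The alternate is reformatted first, mirroring lock steal ordering.
        alternate[resource] = [0] * n_systems
        order.append("alternate")
    # All previous time-of-day values are lost: owner and waiters must retry.
    primary[resource] = [0] * n_systems
    order.append("primary")
    return order
```

After the call, every system's claim on the resource is gone on
both stores, which is why the text describes the repair as a
mass steal.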

If an error is encountered reading in the resource's
data from the primary data store, an attempt is made to 
read the data in from the alternate data store. The alter- 
nate data store contains a duplexed image of the pri- 
mary data store. The error encountered could either be 
a permanent error attempting to read the data from the
primary data store or information contained in the suffix 
or check record could indicate that a physical record is 
not complete or that one of several physical records is 
not logically consistent with the other physical records 
comprising the resource. See FIG. 8 A for more infor- 
mation on the recovery attempted on a READ DATA 
operation. 
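
The fallback from the primary to the alternate data store can
be sketched as follows. This is an illustration only: the
record layout (a "check" stamp plus a "suffix" stamp on each
physical record), the exception PermanentReadError and the
functions is_consistent and read_data are assumptions made for
the example, and the consistency test is simplified to "every
suffix stamp equals the check record stamp".

```python
class PermanentReadError(Exception):
    """Stands in for a permanent I/O error on a data store."""


def is_consistent(resource):
    # Assumed model: each physical record's suffix stamp must equal
    # the stamp held in the check record.
    return all(rec["suffix"] == resource["check"] for rec in resource["records"])


def read_data(read_primary, read_alternate=None):
    """Return (resource, source), preferring the primary data store."""
    try:
        resource = read_primary()
        if is_consistent(resource):
            return resource, "primary"
    except PermanentReadError:
        pass  # fall through to the alternate copy
    if read_alternate is None:
        raise PermanentReadError("primary unusable and no alternate exists")
    resource = read_alternate()
    if not is_consistent(resource):
        raise PermanentReadError("alternate copy is inconsistent")
    return resource, "alternate"
```

A partially written primary copy (one suffix stamp lagging the
check record) is therefore treated the same way as a hard read
error: the duplexed image on the alternate is used instead.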

WRITE SERIALIZED Operation 

The WRITE SERIALIZED operation (explained 
more fully in FIG. 15 and the accompanying text be- 
low) allows the user to update a resource stored on the 
data store in a controlled manner. The user must read 
the data in via the READ SERIALIZED operation, 
update the data and then invoke the WRITE SERIAL- 
IZED operation to copy the changed data back to the 
data store. Shared Data Access Serialization will write 
the updated data to both the primary and the alternate 
data store. 
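
From the user's side, the read, update, write cycle might look
like the sketch below. The callables read_serialized and
write_serialized, the exception LockStolen and the retry limit
are all hypothetical stand-ins; the disclosure states only that
a user whose lock was stolen can redo the operation starting
from READ SERIALIZED.

```python
class LockStolen(Exception):
    """Raised by the (hypothetical) write call when the lock was stolen."""


def update_resource(read_serialized, write_serialized, modify, max_attempts=3):
    """Read under the lock, apply `modify`, write back; retry if the lock is stolen."""
    for _ in range(max_attempts):
        data = read_serialized()           # acquires the lock and returns the data
        try:
            write_serialized(modify(data))  # updates primary and alternate, then unlocks
            return True
        except LockStolen:
            continue                        # lock must be reacquired: start over
    return False
```

The retry limit is a design choice of this sketch, not of the
described mechanism; it simply keeps the example from looping
forever.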

In order to process certain types of errors, Shared Data
Access Serialization maintains data sequence numbers for
each resource. Data sequence numbers reflect the level of
data with respect to a given system. A new data sequence
number is generated each time a system writes data to the
data store. The data sequence numbers are maintained in the
check record. See FIG. 4, item 403 and FIG. 6, items 602A,
602B, etc. There is one data sequence number for each
system sharing the data store. Each system uses its data
sequence number to determine the result of the data update
during certain error recovery operations.

A system increments its data sequence number in the local
copy of the check records (read in by the READ SERIALIZED
operation) in preparation for possible recovery.

Now that the new data sequence number is generated, the
primary data store can be updated. The UPDATE operation is
comprised of the following sequence of actions. These
actions must be done via a single atomic operation to the
data store. First, a system's lock block is checked to
determine if the resource is still owned by this system. It
must contain the same nonzero value written by the READ
SERIALIZED request. Any change in the state of the lock
block's value indicates that another system has stolen the
lock and therefore the resource from this system. A stolen
lock will prevent the data update portion of this operation
from taking place. The WRITE SERIALIZED operation is
terminated and the user is informed that his UPDATE
operation failed because of a stolen lock. The user can redo
the operation starting from the READ SERIALIZED operation.
Second, assuming that the lock is still held, one or more
user data records, along with their corresponding suffix
records, will be written to the data store. And last, the
check record is written. The check record contains, among
other things, the updated data sequence number for this
system (the data sequence number for other systems is
unaltered).

Once the primary data store has been updated, the alternate
data store can be updated. The alternate data store update
comprises the following sequence of actions, all
accomplished with a single atomic operation against the data
store. First, this system's lock block is checked to
determine if the resource is still owned by this system. The
lock block must be all zeros, the normal state of the lock
block on the alternate data store. Any nonzero value in the
lock block indicates that another system has stolen the lock
and therefore the resource from this system. A stolen lock
will prevent the data update portion of this operation from
taking place. Second, assuming that the lock is still held,
one or more user data records, along with their
corresponding suffix records, will be written to the
alternate data store. And last, the check record is written.

If the primary data store failed, was removed, and the
alternate data store became the primary data store while
this process was attempting to update data on the alternate
data store (now the primary data store), this WRITE
SERIALIZED operation is terminated. What is reported to the
user of the WRITE SERIALIZED operation is dependent on the
completion state of the alternate data store update. If the
alternate operation was not started when this WRITE
SERIALIZED operation was interrupted for the primary data
store failure, then the new primary data store (formerly the
alternate) does not reflect the user change. This is
effectively a stolen lock since the all zero lock block of
the alternate is now in the primary. The issuer of WRITE
SERIALIZED is informed that the lock was stolen and the
issuer can retry his operation starting with the READ
SERIALIZED (the lock must be reacquired). If the alternate
WRITE operation completed successfully prior to the
alternate data store becoming the primary data store then
the user changes are now reflected in the primary data
store. The user is informed that the WRITE SERIALIZED
operation was successful. The resource does not have to be
unlocked since the primary with its zero lock blocks is
effectively an unlocked resource.

If the WRITE to the alternate data store was successful, or
the alternate data set totally failed and is no longer
usable (Shared Data Access Serialization can continue
without an alternate data store), or if no alternate data
store is being used (an alternate data store is optional)
then the user data update has been successfully processed.
The WRITE SERIALIZED operation is completed by releasing the
resource's lock for this system. This is accomplished by
invoking the unlock operation which unlocks the resource for
this system and communicates the success of the WRITE
SERIALIZED operation to the user.

If the WRITE to the alternate data store was not successful
because of the lock checking (the lock on the alternate was
not zero) then the lock was stolen from this system. It must
be determined if the lock was stolen from this WRITE
SERIALIZED operation or from a previous WRITE SERIALIZED
operation on this system.

Steal processing, running on some system other than the one
which owns the lock, will have determined that the lock has
been held too long by a single system (our system) and will
take ownership of the resource away from this system by
altering the content of its lock block on both the primary
and alternate data stores. The content of the lock block on
the alternate data store is modified first, followed by
altering the content or value of the lock block on the
primary data store. Specifically, the alternate data store
lock block value is changed from zero to the value found by
steal in the primary data store's lock block (the system
sequence number and time-of-day value generated by READ
SERIALIZED). The primary data store's lock block value is
set to zeros. The system whose lock is being stolen may
possibly complete its operation between the time steal
determined who to steal from and the time it actually steals
the lock. This is no problem for either lock steal or the
target system. The next operation against the resource from
the target system must determine however whether steal was
aimed at it or a previous request. WRITE SERIALIZED must
determine if steal was aimed at this operation or a previous
one. In order to do this, WRITE SERIALIZED must examine the
value in its alternate lock block. If the value of the lock
(a copy of what the READ SERIALIZED has placed in the
primary data store's lock block) does not match what READ
SERIALIZED wrote there then the lock steal was aimed at a
previous request. WRITE SERIALIZED attempts to write the
data again as described earlier, this time checking the
alternate lock for this new value rather than the more
normal zero value. If the write is successful this time,
then the lock block on the alternate data store is reset to
all zeros in order to prevent repeated extra processing in
the future.

If the value of the lock block in the alternate data store
does match, then the lock was stolen from this system for
the current WRITE SERIALIZED operation. The caller of the
WRITE SERIALIZED service
on this system will have committed his changes to the
primary but not to the alternate data store. However, the
changes may not have been duplexed. Until it is certain that
the changes are duplexed, the user cannot be informed that
his operation completed. It cannot be assumed that the
stealing system will accomplish the duplexing since the
stealing system might not have completed the process of READ
SERIALIZED, UPDATE and WRITE SERIALIZED that would ensure
the duplexing of this system's change. Therefore, Shared
Data Access Serialization ensures the duplexing, without
regard to who stole the lock. [illegible] READ SERIALIZED.
This causes the lock to be reacquired in a normal fashion
and the data to be read in again. Note that the data may or
may not have changed several times since we started because
of activity on other systems. The only goal is to ensure
that it is duplexed at whatever level currently exists.
After the READ SERIALIZED is complete, the WRITE SERIALIZED
is attempted to both primary and alternate data stores
again.

As noted, the data store is referenced at various times
during WRITE SERIALIZED processing. While accessing the data
store, WRITE SERIALIZED may experience problems which
require that error recovery be initiated on the data store.
The error recovery to be taken is dependent on which part of
the data store is damaged. A WRITE SERIALIZED operation may
encounter problems accessing the lock block information on
the data store or referencing the resource data. See the
discussion at the end of the READ SERIALIZED section for
information on lock block recovery.

If an error is encountered writing the resource's data to
the primary or alternate data store, an attempt is made to
recreate the data in the failed data store. Reformatting
(recreating) physically restructures the track or tracks
containing the data for this resource on the data store.
Formatting writes are used instead of normal WRITE
operations. See FIG. 16 for more information on WRITE DATA
and repair processing.

UNLOCK Operation

The UNLOCK operation (explained more fully in FIG. 17 and
the accompanying text below) allows the user to cancel a
previous READ SERIALIZED request without updating any data
on the data stores. It is also used by the WRITE SERIALIZED
operation to complete its processing.

UNLOCK alters the access control information for a resource
to remove from the information the indication that this
system is interested in the resource. (See detail below.) It
then looks at the access control information to determine
what other system, if any, should own the resource next. If
a next owner is found, UNLOCK will notify that system that
it is now the owner.

The data store is referenced at various times during UNLOCK
processing. While accessing the data store, UNLOCK may
experience problems which require that error recovery be
initiated on the data store. An UNLOCK operation may
encounter problems accessing the lock block information on
the data store. See the discussion at the end of the READ
SERIALIZED section for information on lock block recovery.

READ

In FIG. 7 at 701 a test is made whether any data stores
exist. (That is, whether there is a primary data store. This
is indicated in FIG. 25 by the "functional" indicator 2502
in the Primary Store descriptor 2501.) If the test indicated
that no primary existed, return is made to the caller. If
the test indicated that a primary exists, READ DATA is
invoked 702, to read the resource from the data store. The
details of READ DATA are shown in FIG. 8. A test is then
made whether there was an uncorrectable failure of the
primary data store. If there was, the test at 701 is
reexecuted as indicated above. If the data was either
successfully read from the primary or alternate, then return
is made to the caller.

READ DATA

FIG. 8 illustrates the control flow for READ DATA
processing. At 801, a channel program is built to read the
resource from the primary data store. Then, 802, the channel
program is started and its completion is awaited. At 803, a
test is made whether the data was successfully read by the
channel program. If so, 804, a determination is made if the
data is reliable by examination of the suffix and check
records. This consistency check is accomplished by checking
to insure that the [illegible] (FIG. 6 at 603) within each
suffix record and the check record is [illegible] check
record, and that the time-of-day value for single
[illegible] (FIG. 6 at 604) in each suffix record and the
check record [illegible] time-of-day value within the Shared
Data Access Serialization global control information (FIG. 3
at 302A). Next, 805, a test is made whether the data was in
fact reliable. If it was reliable, return is made to the
caller. If not, processing continues as shown in FIG. 8 at
807. If the test at 803 indicated that the data was not read
successfully from the primary data store, a test is made at
806 whether the error was a correctable permanent error or
not. Whether or not an error is considered correctable is an
indication of whether or not the entire data store is
considered inaccessible, or simply a single record is
considered in error. For example, a return code indicating
that there was a channel program check would be considered
an uncorrectable permanent error. A return code indicating
an incorrect record length would be considered a correctable
permanent error. If the permanent error was considered not
correctable, permanent error processing is invoked (808) to
remove the nonusable primary data store. This processing is
indicated below in the description of FIG. 9. Return is then
made to the caller. A test is made at 807 whether a
synchronized alternate exists. (Indicated by indicator 2506,
FIG. 25.) If the test at 807 indicates that a synchronized
alternate does not exist, permanent error processing is
invoked as indicated above [illegible]. If a synchronized
alternate does exist, a channel program is built 809 to read
the resource from the alternate data store, and this channel
program is initiated 810. Its completion is then awaited. A
test is then made 811 whether the channel program read the
data successfully. If so, 812, a determination is made
whether the data just read is reliable. This determination
is done as indicated in the description of 804 above. A test
is next made if the data was reliable 813.




If so, return is made to the caller. If not, permanent error
processing is invoked 814 to remove the nonusable primary
store, and is again invoked 815 to remove the nonusable
alternate data store. Return is then made to the caller. If
the test at 811 indicated that the data was not read
successfully from the alternate data store, permanent error
processing is invoked as indicated above at 814 and 815, and
return is made to the caller.

Permanent Error Processing

FIGS. 9A and 9B illustrate control flow for permanent error
processing. At 901 the identification for the system
encountering the permanent error is saved locally. Next,
902, serialized requests on the system are stopped by
stopping the task that processes serialization requests.
Then a test is made at 903 whether the failure was a failure
of the alternate data store. If so, 904, an indication that
the alternate failed is set 2510, the operator is notified,
and the alternate is removed from use by turning off the
functional indicator (FIG. 25 at 2505). Next, a test is made
905 whether the failure was a primary failure. (This test is
made immediately after step 903 if there was not a failure
of the alternate data store.) If there was a primary
failure, 906, the operator is notified, an indication is set
that the primary data store failed, and it is removed from
use by turning off the functional indicator (FIG. 25 at
2502). Then, 907, (or following step 905 if there was no
primary failure) other systems are signalled (with the data
set name and volume serial of the failing data set) about
this error (for example, by use of a channel-to-channel
communication). Next, 908, a test is made whether an
alternate data store exists and the primary has failed. If
so, an indication that the alternate failed is set 909 and
the alternate is removed from use by turning off the
functional indicator (FIG. 25 at 2505). The alternate is
subsequently used as the primary (910) (by moving the
alternate store descriptor information (2504) into the
primary (2501)). If the test at 908 indicated that either
the primary did not fail or the alternate does not exist, a
test is made 911 whether the alternate exists. If not, the
system is terminated 912. If an alternate does exist (and
also following the processing described above for step 910),
a test is made 913 whether all systems have seen this error
(by comparing the locally saved system identifiers with the
list of participating systems in the data store's global
control information 302). If not, 914, the other systems
which have not seen it are signalled of the permanent error
and additional signalling from these systems is awaited 915.
If all systems have seen the error, serialized requests for
this system are restarted 916 (by restarting the task that
processes serialization requests), and the routine is
exited. On an entry for a signal from another system or as a
result of a timer expiring the identification of this system
is saved in a local area 917, and processing continues as
indicated above at 913.

READ SERIALIZED Request

FIG. 10 illustrates the control flow for a READ SERIALIZED
Request. At 1001, a test is made whether any data stores
exist (2501). If a primary data store does not exist, return
is made to the requestor. If a primary does exist, a request
TOD for lock ownership is generated 1002 and the request is
queued to the lock anchor for that resource (FIG. 25 at
2503). Next, 1003, write lock block processing is invoked to
write the request TOD to this system's lock block in the
primary data store. This processing is explained more fully
at FIG. 13. The lock block that is written by write lock
block processing will contain the system sequence number
associated with this system, and the current TOD value. This
lock block is indicated in FIG. 5 at 501. After return from
write lock block processing, a test is made 1004 whether
there was an uncorrectable failure of the primary data
store. If not, read lock block processing is invoked to read
all lock blocks from the primary data store and thus
determine resource ownership 1005. Read lock block
processing is illustrated more fully in FIG. 11. After
return from read lock block processing, a test is made 1006
whether there was an uncorrectable failure of the primary
data store. If not, a test is made whether all other lock
blocks equal zero 1007. A yes answer to the test at 1007
indicates by lock rule 1 that this system owns the resource
and processing continues at 1019 as will be described below.
If the test at 1007 indicated that not all other lock blocks
are equal to zero, there is contention for the resource. A
test is then made 1008 whether all other TODs are older than
this TOD. If all other TODs are older than this TOD, then,
by lock rule 2, this system is not the owner of the
resource. A test is then made at 1009 whether the redrive
TOD is equal to the request TOD. This test must be made
because even though this system was not the owner when READ
SERIALIZED request processing read lock blocks at 1005, it
is possible that lock owner signal processing ran
asynchronously during READ SERIALIZED request processing and
this system has since become the owner of the resource.
Therefore, if the test at 1009 indicates that the redrive
TOD is equal to this request TOD, we have now become the
owner of the resource and processing continues as indicated
below at 1019. If the redrive TOD is not equal to this
request TOD, this system is still not the owner of the
resource, and the routine is exited. A lock owner signal
will be needed in the future. If the test at 1008 did not
indicate that all other TODs are older than this TOD, then
by lock rule 3 resource ownership is indeterminate and 1010
a new request TOD for a lock ownership is generated. Next,
1011, write lock block processing is invoked to write a new
request TOD to this system's lock block in the primary data
store. Write lock block processing is explained more fully
in FIG. 13. On return from write lock block processing, a
test is made 1012 whether there was an uncorrectable failure
of the primary data store. If not, read lock block
processing is invoked 1013 to read all lock blocks from the
primary data store to determine ownership. (See FIG. 11.) On
return from read lock block processing, a test is made
whether there was an uncorrectable error of the primary data
store. If not, a test is made 1015 whether the redrive TOD
is equal to the request TOD for the same reasons as
explained above at 1009. If these values are equal, then
this system has become the owner of the resource and
processing continues at 1019 as will be explained below. If
these values are not equal, a lock owner signal is sent to
the system with the oldest TOD 1016. The routine is then
exited, to await a lock owner signal.

At 1019, read data processing is invoked to read the
resource from the data store. (Read Data Processing is
outlined in FIG. 8.) After return from read data processing,
a test is made 1020 whether an uncorrectable failure of the
primary data store occurred. If not, return is made to the
request caller. When lock owner signal processing (FIG. 14,
which will be described more fully below) receives a lock
owner signal and determines that the signal is valid for
this system, it invokes read serialized processing at block
1017. At 1017, a test is made whether any data stores exist.
If not, return is made to the request caller. If a primary
data store does exist, a test is made 1018 whether there was
an uncorrectable failure of the primary data store. If not,
processing continues at block 1019 as described above.

If any of the tests above (1004, 1006, 1012, 1014, 1018,
1020) indicated an uncorrectable failure of the primary data
store, READ SERIALIZED processing is reinvoked at 1001 as
explained above.

Read Lock Blocks

FIG. 11 illustrates the control flow for read lock blocks
processing. At 1101, a channel program is built to read lock
blocks from the primary or alternate data store. At 1102,
the channel program is started and its completion is
awaited. A test is then made, 1103, whether the lock blocks
were read successfully. If so, return is made to the caller.
If not, a test is made 1104 whether there was a correctable
permanent error. If not, permanent error processing is
invoked 1105 to remove the nonusable primary or alternate
data store. Permanent error processing is described in FIG.
9. Return is then made to the caller. If there was a
correctable permanent error, fix lock block processing is
invoked 1106 to repair the lock blocks. Fix lock block
processing is described in FIG. 12. Return is then made to
the caller.

Fix Lock Block Processing

FIG. 12 illustrates the control flow for fix lock block
processing. At 1201 a test is made whether an alternate data
store exists (2505). If one does exist, the lock blocks for
this resource are reformatted on the alternate data store
1202. The reason the lock blocks are reformatted on the
alternate data store before the lock blocks on the primary
data store is that fixing lock blocks is logically a mass
lock steal operation and, as will be shown below in lock
steal processing, lock stealing writes first to the
alternate data store, then to the primary data store. (See
FIG. 18.) FIG. 4 shows the relationship of the lock blocks
to the resource. Since the lock blocks are being
reformatted, any previous TOD value in the lock blocks is
lost. The number of lock blocks indicates the number of
systems that are capable of sharing the resource, that is,
the number of systems that can share this particular data
store. A test is then made at 1203 whether the reformatting
was successful or not. If not, permanent error processing is
invoked 1204 to remove the nonusable alternate data store.
Permanent error processing is described at FIG. 9. Return is
then made to the caller. If the reformat was successful, the
lock blocks for the resource on the primary data store are
reformatted 1205. This processing at 1205 is also executed
if the test at 1201 indicated that an alternate data store
did not exist. A test is then made 1206 whether this
reformatting of lock blocks on the primary data store was
successful or not. If not, permanent error processing is
invoked 1207 to remove the nonusable primary data store.
Permanent error processing is described in FIG. 9. Return is
then made to the caller. If the reformatting of the lock
blocks was successful, an indication is set that the data
store failed for the request 1208. The indicator reflects
the primary or alternate data store, depending on whether
the read lock block processing was entered to read lock
blocks from the primary or the alternate data store. Return
is then made to the caller.

WRITE LOCK BLOCK Processing

FIG. 13 illustrates the control flow for WRITE LOCK BLOCK
processing. At 1301, a channel program is built to write the
lock block to the primary or alternate data store. At 1302,
the channel program is started and its completion is
awaited. A test is then made, 1303, whether the lock block
was written successfully. If so, return is made to the
caller. If not, a test is made 1304 whether there was a
correctable permanent error. If not, permanent error
processing is invoked 1305 to remove the nonusable primary
or alternate data store. Permanent error processing is
described in FIG. 9. Return is then made to the caller. If
there was a correctable permanent error, fix lock block
processing is invoked 1306 to repair the lock blocks. Fix
lock block processing is described in FIG. 12. Return is
then made to the caller.

RECEIVED LOCK OWNER SIGNAL Processing

FIG. 14 illustrates the control flow for the receipt of a
lock owner signal. [illegible] a test is made 1401 whether
the data store is the same as when the signal was sent. (An
ID for the primary data store is sent with the signal, and
is checked against the current primary data store in use on
this system.) If not, the routine is exited indicating an
invalid lock owner signal and the signal is discarded. If
so, a test is made whether there are any requests pending
for the given resource 1402. (The lock anchor for the
specific resource (FIG. 25 at 2503) [illegible].) If the
test at 1402 indicated that there are requests, a test is
made whether this signal is the "newest" signal 1403. This
test is made by comparing the request TOD of the input
signal to the redrive TOD in the request element (2511) (to
discard "old" signals). If the request TOD is newer, then
the request TOD from the signal is saved 1404 in redrive TOD
field 2511 of the lock request element. (This is done to
allow READ SERIALIZED to detect if it has become the owner
of the resource; see FIG. 10 at 1009 and 1015.) If it is not
the newest signal, then this saving is bypassed. Next, 1405,
a test is made whether READ SERIALIZED processing is
expecting a lock owner signal. (That is, whether READ
SERIALIZED exited after receiving a NO response to the test
1009 or 1015 in FIG. 10.) If it was not expecting a LOCK
OWNER SIGNAL, the routine is exited. If such a signal was
expected, then exit is made to READ SERIALIZED lock owner
signal processing (FIG. 10 at 1017). If the test at 1402
indicated that there are no requests for the given resource
on this system (e.g., perhaps a redundant signal was
received which no longer applies) then the read lock block
processing is invoked 1406 to read all lock blocks from the
primary data store. A test is then made 1407 whether there
was a failure of the primary data store in reading the lock
blocks. If so, the routine is exited. If not, a test is made
1408 whether any system is waiting for this resource. If not
(i.e., all lock blocks are equal to zero), the routine is
exited. If there is a system waiting for lock owner signal,
then a test is made whether this system is the one which is
waiting 1409 (determined because each system knows its lock
block number). If not, the lock owner signal is sent to the
oldest waiter, 1410, and the routine is exited. The lock
owner's signal will include the owner's request TOD
(obtained from his lock block), and
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the ID (data set name and volume serial) of the primary 
data store. If this system is the waiting system, write 
lock block processing in invoked 1411 to clear this 
system's lock block. (It is not really waiting.) (This 
implies an error condition where the data store and this 5 
system got out of synchronization.) Processing then 
continues at 1406. 

WRITE SERIALIZED Request Processing

FIG. 15 illustrates the control flow for WRITE SERIALIZED request
processing. At 1501, a test is made whether there has been an uncorrectable
error of the primary data store. If so, return is made to the requestor
with an error indication. No write is issued. If not, write data processing
is invoked 1502 to write data to the primary data store. Write data
processing is illustrated in detail in FIG. 16. (The request TOD is passed
to WRITE DATA for lock verification checking.) Next, 1503, a test is made
whether an uncorrectable failure of the primary data store occurred in
writing the data. If so, again return is made to the requestor with an
error indication that the write was not made. If not, a check is made
whether the primary lock was stolen before the data was written to the
primary 1504. If so, an error return is made to the caller indicating that
the write was not completed. If not, a test is made whether an alternate
data store exists 1505. If it does not exist, UNLOCK processing is invoked
1506 to transfer control to unlock to complete the WRITE SERIALIZED
request. UNLOCK processing is described in more detail in FIG. 17. If an
alternate data store does exist, WRITE DATA processing is invoked 1507 to
write data to the alternate data store. (A zero value is passed to WRITE
DATA for lock verification checking.) WRITE DATA processing is described in
more detail in FIG. 16. Next, 1508, a test is made for an uncorrectable
failure of the primary data store. If no such failure occurred, a test is
made 1509 for an uncorrectable failure of the alternate data store. If no
such failure occurred, a test is made 1510 whether the alternate lock was
stolen. If it was stolen, read lock block processing is invoked 1511 to
read this system's lock block from the alternate data store. Read lock
block processing is described in more detail in FIG. 11. Next, 1512, a test
is made for an uncorrectable failure of the primary data store. If no such
failure occurred, an indication is set that the alternate lock block needs
resetting 1513. Next, 1514, a test for a failure of the alternate data
store is made. If no failure occurred, a test is made whether the lock was
stolen from this request 1515. If it was not, WRITE DATA processing is
invoked 1516 to write data to the alternate data store. (The TOD of the
lock just read is passed to WRITE DATA for lock verification checking.)
WRITE DATA processing is described in more detail in FIG. 16. Next, the
test at 1508 is executed and processing continues from that point as
indicated above. If the test at 1508 indicated that there was an
uncorrectable failure of the primary data store, a test is made 1517
whether the alternate data store was successfully updated. If so, return is
made to the requestor indicating a successful write completed. If not,
return is made to the requestor with an error indication that the write was
not made. This same error return is made if the test at 1512 indicated an
uncorrectable failure of the primary data store.

If the test at 1510 indicated that the alternate lock was not stolen, a
test is made 1518 whether the alternate lock block needs resetting. If so,
1519, the indication of this need is reset, and, 1520, write lock block
processing is invoked to write zero to the alternate data store lock block.
Next (or if a NO indication to the test at 1518 was received), UNLOCK
processing is invoked 1521 to transfer control to unlock to complete the
WRITE SERIALIZED request. UNLOCK processing is described in more detail in
FIG. 17. If the test at 1515 indicated that the lock was stolen from this
request, write lock block processing is invoked 1522 to write zero to the
alternate lock block. Next, 1523, a test for an uncorrectable failure of
the primary data store is made. If such a failure occurred, return is made
to the requestor indicating an error situation and that the write was not
made. If there was no such failure, READ SERIALIZED processing is invoked
1524 to reobtain the resource and duplex the user changes. READ SERIALIZED
processing is described in more detail in FIG. 10. Processing then
continues with the test at 1501, as described previously. A yes answer to
the test at 1509, or 1514, also results in the invocation of UNLOCK
processing as indicated for 1521.
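
The duplexed write described above can be pictured with the following Python sketch. It is illustrative only and is not the disclosed implementation: the names `Store`, `write_data` and `write_serialized` are hypothetical, the data stores are modelled as in-memory dictionaries, all uncorrectable-error paths are omitted, and UNLOCK is reduced to zeroing the lock block. The comparison used for the test at 1515 (the alternate lock block holds this request's own TOD) is an assumption, since the text does not state how that test is made.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Store:
    """Hypothetical in-memory stand-in for a primary or alternate data store."""
    lock_blocks: dict                      # system id -> TOD value, 0 = no request
    data: dict = field(default_factory=dict)


def write_data(store, system_id, key, resource, payload):
    """Write only if this system's lock block still equals the caller's key."""
    if store.lock_blocks[system_id] != key:
        return False                       # unequal compare: lock was stolen
    store.data[resource] = payload
    return True


def write_serialized(primary: Store, alternate: Optional[Store],
                     system_id, request_tod, resource, payload):
    # 1502/1504: the primary write is verified against the request TOD.
    if not write_data(primary, system_id, request_tod, resource, payload):
        return "error: primary lock stolen"
    if alternate is None:                  # 1505: no alternate data store
        primary.lock_blocks[system_id] = 0  # 1506: UNLOCK (zeroing only)
        return "ok"
    key = 0                                # 1507: zero is passed for the alternate
    while not write_data(alternate, system_id, key, resource, payload):
        stolen_tod = alternate.lock_blocks[system_id]   # 1511: read lock block
        if stolen_tod == request_tod:      # 1515 (assumed comparison)
            alternate.lock_blocks[system_id] = 0        # 1522
            return "reobtain"              # 1524: caller redrives READ SERIALIZED
        key = stolen_tod                   # 1516: retry with the TOD just read
    alternate.lock_blocks[system_id] = 0   # 1519/1520: reset alternate lock block
    primary.lock_blocks[system_id] = 0     # 1521: UNLOCK (zeroing only)
    return "ok"
```

In this sketch a result of "reobtain" corresponds to the path through 1522 to 1524, where the resource must be obtained again before the user changes can be duplexed.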

WRITE DATA

FIG. 16 illustrates the control flow for WRITE DATA processing. At 1601, a
channel program is built to write the resource to the primary or alternate
data store. (The first part of the channel program verifies that this
system's lock block still contains the key value provided by the caller. An
"equal" compare means that the lock is still owned, and the channel program
can still continue.) At 1602, the channel program is started and its
completion is awaited. A test is then made, 1603, whether the data was
successfully written. If so, return is made to the caller. If not, a test
is made 1604 whether the lock had been stolen. If so, the "lock stolen"
indicator in the lock request element (2513) is set 1605, and return is
made to the caller. If the lock was not stolen, a test is made 1606 whether
there was a correctable permanent error. If not, permanent error processing
is invoked 1607 to remove the nonusable primary or alternate data store.
Permanent error processing is described in more detail in FIG. 9. If the
error was correctable, a test is made whether the error was because of a
lock block problem, or a data write failure. If a data write failure, a
channel program is built 1610 to reformat the track of the data store, and
the channel program is started 1611 and its completion awaited. A test is
then made 1612 whether the reformatting was successful. If so, return is
made to the caller. If not, permanent error processing is invoked 1613 to
remove the nonusable primary or alternate data store. If the test at 1608
indicated a lock block problem, fix lock block processing is invoked 1609
to repair the lock blocks, and return is made to the caller. Fix lock block
processing is described in more detail at FIG. 12.
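
The branching that follows completion of the channel program can be summarized as a small decision function. The Python sketch below is illustrative only: the function name, its boolean parameters and the returned action strings are hypothetical, and they simply label the outcomes of the tests at 1603 through 1612 as described above.

```python
def after_channel_program(written, lock_stolen=False, correctable=False,
                          lock_block_problem=False, reformat_ok=True):
    """Return the actions taken once the channel program has completed."""
    if written:                                  # 1603: data written
        return ["return to caller"]
    if lock_stolen:                              # 1604/1605
        return ["set lock-stolen indicator", "return to caller"]
    if not correctable:                          # 1606/1607
        return ["permanent error processing"]
    if lock_block_problem:                       # 1609
        return ["fix lock blocks", "return to caller"]
    if reformat_ok:                              # 1610 to 1612
        return ["reformat track", "return to caller"]
    return ["reformat track", "permanent error processing"]   # 1613
```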

UNLOCK REQUEST

FIG. 17 illustrates the control flow for UNLOCK request processing. At 1701
a test is made if there was an uncorrectable failure of the primary data
store. If so, return is made to the request caller. If not, write lock
block processing is invoked 1702 to zero out this system's lock block on
the primary data store. This indicates that the resource is no longer
needed by this system. Write lock block processing is illustrated in more
detail in FIG. 13. Next, a test is made 1703 for an uncorrectable failure
of the primary data store. If such a failure occurred, return is made to
the request caller. If not, read lock block processing is invoked 1704 to
read all lock blocks from the primary data store and thus determine
resource ownership. Read lock block processing is described in more detail
in FIG. 11. Next, 1705, a test is again made for an uncorrectable failure
of the primary data store. If so, return is made to the request caller. If
not, 1706, the oldest waiter (oldest TOD) is found among the lock blocks.
Then, 1707, a test is made whether such a waiter was found. If not, return
is made to the request caller. If so, a lock owner signal is sent to this
oldest waiter 1708, including his request TOD (obtained from his lock
block) and the ID of the data store (the data set name and volume serial).
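
The selection of the next owner during UNLOCK can be sketched as follows. This Python fragment is illustrative only: the function name `unlock` is hypothetical, lock blocks are modelled as a dictionary from system identifier to TOD value, TOD values are assumed to be integers where a smaller value is older, and the error tests and the actual sending of the signal are omitted.

```python
def unlock(lock_blocks, system_id):
    """Zero this system's lock block, then return the oldest waiter (or None)."""
    lock_blocks[system_id] = 0                       # 1702: release the resource
    waiting = {sid: tod for sid, tod in lock_blocks.items() if tod != 0}  # 1704
    if not waiting:                                  # 1707: no waiter found
        return None
    return min(waiting, key=waiting.get)             # 1706: oldest (smallest) TOD
```

The returned system is the one to which the lock owner signal of step 1708 would be sent.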

LOCK STEAL Processing

FIG. 18 illustrates the control flow for LOCK steal processing. Lock steal
processing is initiated as the result of timer driven processing (see FIG.
27) that examines all the lock request elements for this system to
determine those that have been waiting for a resource an excessive amount
of time. (The frequency of invocation and definition of "excessive" must
not be too short to unnecessarily steal, nor too long that failing systems
impede performance of systems that are functional. In the preferred
embodiment, the invocation interval is 6 seconds, and the "excessive"
definition is 12 seconds.) At 1801, read lock block processing is invoked
to read all lock blocks from the primary data store. Read lock block
processing is described in more detail in FIG. 11. At 1802, a test is made
whether the current system's lock block is equal to zero. (This could
happen, for example, in the following case: System A obtains the lock for
resource X; subsequently, System B attempts to access X, sees System A's
lock block set, and waits for a lock owner signal, then is stopped for some
reason; System C attempts to access X, sees A and B's lock block set, and
waits for a lock owner signal; System A then releases X (lock block TOD now
zero), and stops for some reason before sending a lock owner signal to B;
System C then runs lock steal and, seeing B as "owner" (oldest nonzero
TOD), steals the lock from B (B's TOD is now zero). Finally, B is
restarted, and lock steal runs, since B has waited a long time, and finds
its TOD zero!) If so, READ SERIALIZED is redriven from the beginning, and
the routine is exited. If not, the current owner is found 1804. (Oldest TOD
from lock blocks.) A test is then made 1805 whether the current owner is
this system. If so, a lock owner signal is sent to this system 1806, and
the routine is exited. If not, a test is made whether the owner is the same
as the last time 1807. (Previous owner was saved at 1808 in 2514.) If the
owner is different from the last time, the current owner information is
saved 1808 in the request element (2514), and the routine is exited. If the
same owner, a test is made whether an alternate data store exists. If so,
WRITE lock block processing is invoked 1810 to WRITE the current owner's
request TOD obtained from the primary data store, to the current owner's
lock block on the alternate data store. Then WRITE lock block processing is
invoked 1811 to write a zero TOD to the current owner's lock block on the
primary data store. The new owner is then found 1812 (by finding the oldest
request TOD from the primary lock blocks), and a lock owner signal is sent
to the new owner 1813. The routine is then exited.

New Alternate Processing

FIG. 26 illustrates the control flow for New Alternate Processing. New
Alternate Processing is required when a new alternate data store is to be
initialized and placed into service. The processing is initiated with an
operator command, or because of a signal from another system within a
system complex which itself has received an operator command. At 2601 a
test is made whether this processing request is the initial request signal.
If so, 2602, a further test is made whether the operator request came from
the system operator of this particular system. If so, 2603, an indication
is set that this system is responsible for synchronizing the new alternate
data store. (Synchronized means that the alternate data store is a
duplicate of the primary data store.) If the request was not from this
system, 2604, the system must stop using the current alternate data store
because the system which received the operator command would have already
removed the current alternate data store from service. A test is then made,
2605, whether the new alternate data store is usable. (That is, is it
possible to locate the volume and data set for the new alternate indicated
in the operator command.) If not, 2606, a test is made whether the request
came from this system. If not, an "alternate failed" signal is sent to all
other participating systems, and the routine is exited. If the request did
come from this system, the routine is simply exited. If the test at 2605
indicated that the new alternate is usable, a test is made 2608 whether the
request came from this system. If so, 2609, the system must stop using the
current alternate. Then, 2610, the system starts using the new alternate,
and marks it as not synchronized (indicating that it is not yet ready to be
used as a primary data store). The synchronized indicator is illustrated in
FIG. 25 at 2506. Then, 2611, a test is made whether the request came from
this system. If not, an "alternate accepted" signal is sent 2612, and a
timer is set 2613. The purpose of setting a timer is to insure that the
system does not have to wait for an unreasonable amount of time for another
system to synchronize the alternate data store. If the test at 2611
indicated that the request did come from this system, a test is made 2614
whether this system is the only system using the data store. (This may be
determined by the global control information in the primary data store. See
FIG. 3 at 302.) If this is the only system using the data store,
synchronize alternate processing is invoked 2615 to synchronize the
alternate data store with the primary data store, and the routine is
exited. If other systems are using the data store, 2616, an initial request
signal is sent to all other systems. Then a timer is set 2617, and the
routine is exited.

Synchronize alternate processing is illustrated at the lower portion of
FIG. 26B. First, 2618, a READ SERIALIZED and a WRITE SERIALIZED are issued
for each resource on the primary data store. (The global control
information (FIG. 3 at 302) indicates what resources are involved.) A test
is then made 2619 whether any errors occurred in the reading or writing. If
errors did occur, 2620, an "alternate failed" signal is sent to each other
participating system, and the routine is exited. If there were no errors,
2621, an "alternate functional" signal is sent to each other participating
system.



The synchronized indicator is then set, 2622 (see FIG. 25 at 2506), and the
routine is exited.

If the test at 2601 indicated that this was not the initial request for new
alternate processing, a test is made 2623 whether the signal received was
an "alternate accepted" signal. If so, the identifier of the system sending
the signal is saved 2624, and a test is made 2625 whether the current
system is the system which must synchronize the alternate data store.
(Recall that the system with this responsibility was identified and saved
at 2603.) If the current system is not the system which must do the
synchronization, the routine is exited. If the current system is the system
which must synchronize, a test is made whether all participating systems
have yet responded 2626. If not, the routine is exited. If so, 2627,
synchronize alternate processing is invoked to synchronize the alternate
data store with the primary data store.

If the test at 2623 indicated that this was not an "alternate accepted"
signal, a test is made 2628 whether it was an "alternate failed" signal. If
so, the new alternate data store is removed from use by this system 2629,
and the routine is exited. If it was not an "alternate failed" signal, a
test is made 2630 whether it was an "alternate functional" signal. If so,
the synchronized indicator is set 2631 (see FIG. 25 at 2506), and the
routine is exited. If it was not an alternate functional signal, a test is
made 2632 whether it was a timer expiration signal. If so, a test is made
at 2632.5 to determine if this system must assume responsibility for
synchronizing the alternate data store. This system will assume
responsibility if the system currently responsible is no longer active (the
global control information, FIG. 3 at 302, contains an array of systems
using the data store and indicates if the system is active). If this system
is not to assume responsibility to synchronize the new alternate, this
routine exits. If it is to assume responsibility, this is indicated at
2633. Next, 2634, a test is made whether all systems have responded. (Note
that the global control information, FIG. 3 at 302, indicates all
participating systems.) An asynchronous task may monitor the condition of
all participating systems, to determine whether any one of them is taken
out of service, for example, by having each participating system update a
"heartbeat" field in a commonly accessible data field at regular intervals.
The lack of such a heartbeat would mean that the asynchronous task must
update the global control information to remove this system from the list
of participating systems.

If all participating systems have responded, 2635, synchronize alternate
processing is invoked to synchronize the alternate data store with the
primary data store, and the routine is exited. If all systems have not yet
responded, 2636, follow-up signals are sent to all participating systems,
and a timer is set 2637. (In this embodiment, the timer value is set to
expire after 20 seconds.) If the test at 2632 did not indicate a timer pop
signal, a test is made 2638 whether this signal was a follow-up signal. (As
indicated, for example, at 2636.) If not, the routine is exited. If it was
a follow-up signal, 2639, a test is made whether the signal is a signal
with a TOD older than the current request. If so, a test is made 2640
whether the signal was for the current alternate data store. If so, an
"alternate accepted" signal is sent, 2641, and the routine is exited. If it
was not for the current alternate, an "alternate failed" signal is sent,
2642, and the routine is exited. If the test at 2639 indicated that the
signal was not for an older request, a test is made 2643 whether the TOD
for this signal is newer than that of the current request. If so, 2644, a
test is made whether the request is currently in progress. (Indicated by an
internal status indicator.) If so, 2645, an "alternate failed" signal is
sent to fail this request (this could occur, for example, because two
systems must be trying to bring up different alternate data stores), and
the routine is exited. If the request is not currently in progress, this
indicates that the current system lost the initial request signal, and "new
alternate" processing is reinstituted at 2602. If the test at 2643 did not
indicate a newer request, this means that the signal must be for the
current request. In this case, an "alternate accepted" signal is sent 2646,
the timer is set, and the routine is exited. The reason for sending the
"alternate accepted" signal is that it is possible that the previously sent
signal may have been lost.

The expiration of a timer simply requires the sending of a timer pop signal
264«.

EXAMPLES

The present invention is most readily understood in the context of examples
of its use in multi-system environments:

READ/WRITE SERIALIZED Example

FIG. 19 is an example of READ/WRITE SERIALIZED processing from multiple
systems for the same resource. It shows control flow on each of two systems
designated as System A and System B. At 1901 a user issues a request to
READ SERIALIZED resource X. FIG. 20 at 2001, 2002, 2003 and 2004 show the
state of the lock blocks at the beginning of this READ SERIALIZED request.
All lock block fields are initially zero. In response to the user request,
System A's lock block for resource X is written at 1902. FIG. 20 at 2005
shows the contents of the lock block for System A after this step is
complete. The sequence number 10 has been inserted into the block to
designate System A, and the time-of-day value for this request, TOD1, is
now entered into the time-of-day field. The other lock blocks associated
with resource X, 2006, 2007 and 2008, are unchanged. Next, all lock blocks
for resource X are read 1903. Since, as noted, all lock blocks except those
associated with System A are zero, by lock rule 1 System A owns resource X
1904. Next, 1905, the data for resource X is read into storage. Then return
is made to the READ SERIALIZED requestor, 1906, with an indication that
resource X has been successfully read.

Now, 1907, a user from System B issues a request to READ SERIALIZED
resource X. A System B lock block for resource X is written to the primary
data store 1908. FIG. 20 at 2010 shows the contents of this lock block. The
sequence number 15 is associated with System B, and the time-of-day, TOD2,
is the time of day value associated with the request to write System B's
lock block. All other lock blocks on the primary and alternate data store,
2009, 2011, and 2012, are unchanged. System B next reads all lock blocks
for resource X, 1909. Applying lock rule 2, 1910, System B must wait for
resource X, since all the nonzero lock block values are older than System
B's lock block value. That is, TOD1, 2009, is older than TOD2, 2010. System
B now waits 1911 for an indication that System B is now the owner of
resource X. During this period of waiting, requests for other resources or
records will not be delayed.

Next, 1912, the System A user issues a request to WRITE SERIALIZED resource
X. After a check that the resource is still owned by System A, 1913, the
updated user data for resource X is written to the primary data store.
Since lock block A 2009 still contains a sequence number of 10 and a TOD
value of TOD1, the resource is still owned, and the data update is
successful. Next, 1914, the updated user data for resource X is written to
the alternate data store, checking that the resource is still owned by
System A. Ownership implies that lock block A on the alternate data store
must contain a sequence number of zero, and a TOD value of 0. Since this is
the case, FIG. 20 at 2011, the resource is still owned, and the data update
is successful. Next, 1915, System A's lock block for resource X is written
to the primary data store to unlock resource X. That is, lock block A is
set to all zeroes, FIG. 20 at 2013. Next, all lock blocks for resource X
are read 1916. This allows System A to find the next owner 1917. The next
owner is the system associated with the oldest time-of-day field. In the
example, this is the system whose sequence is 15, FIG. 20 at 2014, i.e.,
System B. System A now informs System B that it is the owner. Finally,
1918, System A returns to the WRITE SERIALIZED requestor, indicating that
the resource has been written. System B, having been notified by System A
that it is now the owner of resource X, reads the data for resource X 1919.
Having done this, System B returns to the READ SERIALIZED requestor,
indicating that the resource X has been read, 1920.
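
The ownership decisions made in this example after each system writes its lock block and reads all lock blocks back can be sketched as below. The Python fragment is illustrative only and is not a complete statement of the Lock Rules: the function name and the returned strings are hypothetical, TOD values are assumed to be integers where a smaller value is older, and the third outcome follows the use of lock rule 3 in Lock Steal Processing Example 1 (at least one other nonzero lock block is younger than this system's own).

```python
def apply_lock_rules(lock_blocks, me):
    """Classify this system's position after it has written its own TOD."""
    mine = lock_blocks[me]
    others = [tod for sid, tod in lock_blocks.items() if sid != me and tod != 0]
    if not others:
        return "own"              # rule 1: every other lock block is zero
    if all(tod < mine for tod in others):
        return "wait"             # rule 2: all nonzero blocks are older
    return "indeterminate"        # rule 3: some nonzero block is younger
```

With System A holding TOD1 and System B holding the later TOD2, the call for A alone gives "own" and the later call for B gives "wait", matching steps 1904 and 1910.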

Lock Steal Processing Example 1

FIG. 21 illustrates an example of lock steal processing where the lock is
stolen before the system successfully writes data to the primary data
store. The example illustrates control flow on three interrelated systems,
designated as System A, System B, and System C.

At 2101, a user issues a request to READ SERIALIZED resource X. FIG. 22 at
2201, 2202, 2203, 2204, 2205 and 2206 shows the state of the lock blocks at
the beginning of this READ SERIALIZED request. All lock block fields are
initially zero. In response to the user request, System A's lock block for
resource X is written at 2102. FIG. 22 at 2207 shows the content of the
lock block for System A after this step is complete. The sequence number 10
has been inserted into the block to designate System A, and the time-of-day
value for this request, TOD1, is now entered into the time-of-day field.
The other lock blocks associated with resource X, 2208, 2209, 2210, 2211
and 2212, are unchanged. Next, all lock blocks for resource X are read
2103. Since, as noted, all lock blocks except those associated with System
A are zero, by lock rule 1 System A owns resource X 2104. Next, 2105, the
data for resource X is read into storage. Then return is made to the READ
SERIALIZED requestor, 2106, with an indication that resource X has been
successfully read. Now, 2107, a user from System B issues a request to READ
SERIALIZED resource X. At a slightly earlier time, 2108, a user from System
C issues a request to READ SERIALIZED resource X. System C writes its lock
block for resource X to the primary data store. This lock block is
illustrated in FIG. 22 at 2215. System B's lock block for resource X is
also written to the primary data store, FIG. 21 at 2110. FIG. 22 at 2220
illustrates the content of this lock block. System C then reads all lock
blocks for resource X, FIG. 21 at 2111, and System B reads all lock blocks
for resource X, 2112. System B discovers that all the nonzero lock block
values are older than its lock block value. That is, it notes that TOD1
(FIG. 22 at 2219), and TOD2 (2221) are older than TOD3 (2220). Applying
lock rule 2, System B must wait for resource X, until it receives a signal
that it is the owner of this resource. During this time, requests for other
resources or records will not be delayed. Meanwhile, 2114, System C
discovers that at least one system has a younger TOD than its TOD (TOD3
(FIG. 22 at 2220) is younger than TOD2 (2221)) so that, applying lock rule
3, the ownership state of the resource X is indeterminate. System C then
updates its lock block with a new TOD value for resource X, FIG. 21 at
2115. FIG. 22 at 2227 shows the new state of this lock block, with TOD4 the
new TOD value. System C next reads all lock blocks for resource X, FIG. 21
at 2116. System C then searches 2117 for the system with the oldest TOD
among the lock blocks and determines that this is System A (TOD1 (FIG. 22
at 2225) is the oldest of the three TODs at this point). System C signals
this ownership to System A. System A receives this ownership signal at
2118, but, already knowing that it is the owner of resource X, it ignores
this signal.

For an unrelated reason, System A is now placed into stop mode by the
system operator, FIG. 21 at 2120. This will eventually cause all waiting
systems, System C and System B, to note that they have been waiting for
resource X for an excessive amount of time. At 2121, System C detects that
it has been waiting for resource X for an excessive amount of time, so it
reads all lock blocks for resource X to determine the current owner.
Similarly, 2122, System B detects that it has waited an excessive amount of
time for resource X, so it reads all lock blocks for this resource to
determine the current owner. At 2123, and 2124, System C and System B
determine that System A is the owner of resource X (System A has the oldest
TOD) and then continue to wait. Once again, at 2125, System C detects that
it has waited for resource X an excessive amount of time, so reads all lock
blocks for this resource to determine the current owner. Determining that
System A is the current owner (System A has the oldest TOD) and noting that
System A was recorded as the owner the last time this system checked for
excessive wait, System C initiates a steal of the lock for resource X from
System A 2126. To accomplish this steal, 2127, the sequence number and TOD
read from resource X from the primary data store that was associated with
System A, FIG. 22 at 2225, is now written by System C to the alternate data
store's lock block for resource X for System A. FIG. 22 at 2234 illustrates
this lock block's value in the alternate data store. Note that the primary
data store's lock blocks, 2231, 2232 and 2233 are unchanged, as are the
alternate data store lock blocks for Systems B and C, 2235 and 2236. Next,
zero values are written to the primary data store's lock block for resource
X for System A, by System C (FIG. 21 at 2128). This new primary data store
lock block associated with System A is illustrated in FIG. 22 at 2237.

Next, System B detects that it has waited an excessive amount of time for
resource X, and so reads all lock blocks for this resource to determine the
current owner, FIG. 21 at 2129. System B determines that it is the owner of
resource X (it has the oldest TOD read from the lock blocks) but was not
the owner the last time that System B checked for excessive wait.
Recognizing that it has somehow become the owner, it signals itself of this
ownership (FIG. 21 at 2130). Meanwhile, 2131, System C reads all lock
blocks for resource X. System C then determines which system should be the
next owner of resource X, FIG. 21 at 2132. This next owner will be System B
(oldest TOD). System C signals System B indicating that it (System B) is
now the owner of resource X. At 2133, a signal is received by System B
indicating that it (System B) is now the owner of resource X (System B
receives a "lock owner" signal; see FIG. 14). At 2135 the second lock owner
signal is received by System B for resource X which, knowing
already that it is the owner, discards this signal. (That is, lock owner
signal processing is exited without causing READ SERIALIZED to be invoked
at the lock owner's signal entry.) At 2134, System A's operator restarts
the system, having accomplished whatever tasks it was stopped for. At 2136,
System B, the new owner of resource X, reads the data that it wishes in for
resource X. It then returns to the READ SERIALIZED requestor, indicating
that resource X has been successfully read (2138). At 2137, a System A user
issues a request to WRITE SERIALIZED resource X. The updated System A user
data for resource X is attempted to be written to the primary data store,
checking its lock block to ensure that resource X is still owned by System
A, 2139. Since the value of its lock block is no longer set to its sequence
number 10 (rather, it has been zeroed out as explained previously; see FIG.
22 at 2237), System A determines that it no longer owns resource X, so the
WRITE SERIALIZED request is not successful (FIG. 21 at 2140), and return is
made to the WRITE SERIALIZED requestor indicating that resource X has not
successfully been written.

Lock Steal Processing Example 2

FIG. 23 illustrates an example of lock steal processing where a lock is
stolen after the system successfully writes data to the primary data store
but before the WRITE to the alternate data store.

At 2301, a user issues a request to READ SERIALIZED resource X. FIG. 24 at
2401, 2402, 2403, and 2404, shows the state of the lock blocks at the
beginning of this READ SERIALIZED request. All lock block fields are
initially zero. In response to the user request, System A's lock block for
resource X is written at 2302. FIG. 24 at 2405 shows the content of the
lock block for System A after this step is complete. The sequence number 10
has been inserted into the block to designate

blocks for resource X to determine the current owner. At 2313, System B
determines that System A is the owner of resource X (oldest TOD) and then
continues to wait. Once again, at 2314, System B detects that it has waited
for resource X an excessive amount of time, so reads all lock blocks for
this resource to determine the current owner. Determining that System A is
the current owner, 2315, (oldest TOD) and noting that System A was recorded
as the owner the last time this system checked for excessive wait, System B
initiates a steal of the lock for resource X from System A. To accomplish
this steal, 2316, the sequence number and TOD read from resource X from the
primary data store that was associated with System A, FIG. 24 at 2409, is
now written by System B to the alternate data store's lock block for
resource X for System A. FIG. 24 at 2415 illustrates this lock block's
value in the alternate data store. Note that the primary data store's lock
blocks, 2413 and 2414, are unchanged, as is the alternate data store lock
block for System B 2416.

At 2317, System A's operator restarts the system, having accomplished
whatever tasks it was stopped for. At 2318, a System A user issues a
request to WRITE SERIALIZED resource X. The updated System A user data for
resource X is written to the primary data store, checking its lock block to
ensure that resource X is still owned by System A, 2319. Note that System B
has not yet zeroed out System A's lock block for resource X, so that System
A believes that it still owns resource X. Now, 2320, System B writes a zero
to the primary data store's lock block for resource X for System A. FIG. 24
at 2417 depicts this new lock block. Then, 2321, System B reads all lock
blocks for resource X. Now, 2322, System A attempts to write the updated
user data for resource X to the alternate data store, checking that the
resource is still owned by System A (see description of FIG. 16, WRITE
DATA, for the
System A, and the time-of-day value for this request, verification performed by the channel program). This 
TODl, is now entered into the time-of-day field. The write however is unsuccessful, 2323, because the lock is 
other lock block associated with resource X, 2406, 2407, 40 no longer owned by System A. 
and 2408 arc unchanged. Next, all lock blocks for re- Although the update of resource X to the alternate 
source X are read 2303. Since, as noted, all lock blocks data store was not successful, the updated data has been 
except those associated with System A arc zero, by lock written to the primary data store previously, at 2319. 
rule 1 System A owns resource X 2304. Next, 2305, the This updated daU has been exposed to other users of 
data for resource X is read into storage. Then return is 45 this resource. Therefore, it would not be correct to 
made to the READ SERIALIZED requestor, 2306. inform the System A user that the write was unsuccess- 
with an indication that resource X has been successfully ful. Rather, it will be necessary to insure that the data is 
read. Now, 2307, a user from System B issues a request duplexed, 2323. In order to do this, it is first necessary to 
to READ SERIALIZED resource X. System B's lock determine 2324 whether the lock was stolen from the 
block for resource X is written to the primary data 50 request currently being processed on this system, or 
store, FIG. 23 at 2308. FIG. 24 at 2410 illustrates the from a previous request on this system. This system's 
content of this lock block. System B next reads all lock lock block is read in for resource X and the alternate 
blocks for resource X, FIG. 23 at 2309. System B dis- lock block is cleared (see FIG. 15 starting at 1511). 
covers that all the nonzero lock block values are older Meanwhile, at 2325, the owner of resource X is being 
than its lock block value. That is, it notes that TODl 55 determined on System B (oldest TOD). System B is 
(FIG. 24 at 2409), is older than TOD2 (2410). Applying determined to be the owner, and is signalled indicating 
lock rule 2, System B must wait for resource X, until it that it is the owner of resource X. At 2326, System B 
receives a signal that it is the owner of this resource. receives its own signal that it is the owner of resource 
During this time, requests for other resources or re- X. On System A, 2327, READ SERIALIZED is in- 
cords will not be delayed. 60 voked internally as a first step toward backing up the 

For an unrelated reason. System A is now placed into user change on the alternate data store. At 2328, System 
stop mode by the system operator, FIG. 23 at 2311. This B reads data in for resource X. At 2329, as a first step in 
will eventually cause all waiting systems (in this case the READ SERIALIZED processing invoked inter- 
System B) to note that they have been waiting for re- nally, System A's lock block for resource X is written to 
source X for an excessive amount of time (see the cxpla- 65 the primary data store. The contents of this lock block 
nation for FIG. 18— Lock Steal Processing). At 2312, are illustrated in FIG. 24 at 2425, with TOD3 being the 
System B detects that it has been waiting for resource X timc-of-day of the present request. At 2330, return is 
for an excessive amount of time, so it reads all lock made to the READ SERIALIZED requestor indicai- 
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tng that resource X has been successfully read by Sys* 
tem B. At 2331, again, as part of the READ SERIAL- 
IZED process invoked internally at 2327, all lock 
blocks for resource X are read. Since all the nonzero
lock block values are older than System A's lock block
value (TOD2 is older than TOD3), System A must
wait for resource X by lock rule 2 (FIG. 23 at 2332). At
2333, the user on System B issues a request to WRITE
SERIALIZED resource X. The updated user data for
resource X is then written to the primary data store,
checking that the resource is still owned by System B,
2334 (see the description of the channel program
verification in FIG. 16). Next, 2335, the updated user data for
resource X is written to the alternate data store,
checking that the resource is still owned by System B. Next,
2336, System B's lock block for resource X is written to
unlock resource X (FIG. 24 at 2430).
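
As an illustration only, the verify-then-write sequence just described (primary write, alternate write, then unlock, each write preceded by an ownership check) might be sketched as follows. The sketch models a single resource; the names `DataStore`, `still_owns`, `write_serialized` and the `between_writes` hook are hypothetical, and the ownership test (the lock block still carries the writer's sequence number) is an assumption drawn from the Example 1 narrative, not a reproduction of the channel program verification of FIG. 16.

```python
from dataclasses import dataclass, field


@dataclass
class DataStore:
    # lock_blocks maps a system name to (sequence number, time-of-day);
    # (0, 0) stands for a zeroed lock block. One resource is modeled.
    lock_blocks: dict = field(default_factory=dict)
    data: dict = field(default_factory=dict)


def still_owns(primary, system, seq):
    """Assumed check: the system's lock block still holds its sequence number."""
    return primary.lock_blocks.get(system, (0, 0))[0] == seq


def write_serialized(primary, alternate, system, seq, resource, value,
                     between_writes=None):
    """Write to both stores, checking ownership before each write, then unlock.

    between_writes is a hypothetical hook that lets a caller simulate another
    system acting between the primary and the alternate write.
    """
    if not still_owns(primary, system, seq):
        return "rejected"                  # lock was stolen before any write
    primary.data[resource] = value
    if between_writes is not None:
        between_writes()
    if not still_owns(primary, system, seq):
        return "needs-duplexing"           # primary updated, alternate not
    alternate.data[resource] = value
    primary.lock_blocks[system] = (0, 0)   # zero the lock block to unlock
    return "written"


if __name__ == "__main__":
    primary = DataStore(lock_blocks={"A": (10, 1)})
    alternate = DataStore()
    print(write_serialized(primary, alternate, "A", 10, "X", "update"))
```

The "needs-duplexing" result corresponds to the situation in this example where the primary write has already been exposed to other users, so the request cannot simply be reported as failed.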

All lock blocks for resource X are read at 2337 and
the next owner is determined, 2338. A lock owner signal
is sent to System A. Return is then made to the WRITE
SERIALIZED requestor at 2339, indicating that resource X
has been successfully written. At 2340, System
A receives a signal that it is now the owner of resource
X. It then reads the data in for resource X. This data
will consist of the changes it originally made, along
with System B's changes. At 2341 return is made to the
READ SERIALIZED requestor indicating that the
resource has been successfully read. In this case, the
requestor was WRITE SERIALIZED processing,
since it was an internal READ SERIALIZED request.
At 2342, the (just read) data for resource X is written to
the primary data store, checking that the resource is still
owned by System A. Then, 2343, the data for resource
X is written to the alternate data store, again checking
that the resource is still owned by this system (channel
program verification; see FIG. 16). System A's lock
block is then written to the primary data store as all
zeroes to unlock resource X, 2344. FIG. 24 illustrates
this at 2433. All lock blocks for resource X are then read
at 2345, and the next owner is determined, 2346. Note
that no lock owner signal is sent since no one is waiting
for resource X at this time. Return is then made to the
WRITE SERIALIZED requestor with an indication
that resource X has been successfully written, 2347.
Both the primary and alternate data store now contain
the combination of System A and System B's changes.
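
The steal at the heart of this example (two consecutive excessive-wait checks that find the same owner, then a copy of the owner's lock block to the alternate data store followed by zeroing it on the primary) can be sketched roughly as below. This is a simplified model, not the patented implementation: lock blocks are plain `(sequence number, TOD)` tuples in dictionaries, the function names and the `state` dictionary are hypothetical, and the sequence number 20 used for System B in the usage lines is invented for illustration (the text gives only 10 for System A).

```python
def current_owner(lock_blocks):
    """Return the system whose nonzero lock block has the oldest TOD, or None."""
    held = {s: tod for s, (seq, tod) in lock_blocks.items() if (seq, tod) != (0, 0)}
    return min(held, key=held.get) if held else None


def excessive_wait_check(waiter, state, primary, alternate):
    """Run each time `waiter` finds that it has waited too long.

    Returns the system to be signalled as owner, or None to keep waiting.
    """
    owner = current_owner(primary)
    if owner is None or owner == waiter:
        return owner
    if state.get("last_owner") != owner:
        state["last_owner"] = owner    # first sighting: record it, keep waiting
        return None
    # Same owner as at the previous check: steal the lock.
    alternate[owner] = primary[owner]  # copy sequence number and TOD to alternate
    primary[owner] = (0, 0)            # zero the former owner's primary lock block
    state["last_owner"] = None
    return current_owner(primary)      # oldest remaining TOD is the new owner


if __name__ == "__main__":
    primary = {"A": (10, 1), "B": (20, 2)}    # TOD1 older than TOD2
    alternate = {"A": (0, 0), "B": (0, 0)}
    state = {}
    print(excessive_wait_check("B", state, primary, alternate))  # first check
    print(excessive_wait_check("B", state, primary, alternate))  # second check
```

Requiring the same owner on two successive checks mirrors the narrative at 2313 through 2315, where System B continues to wait after the first check and steals only after the second.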

Although the foregoing description and the system
illustrated in the drawings are considered to illustrate
the preferred embodiment of the invention, various
changes and modifications will occur to one skilled in
the art without departing from the scope of the invention.

What is claimed is: 

1. In a multi-system central electronic complex
(CEC) comprising systems each having main storage,
system resources, an operating system for managing
said system resources, and shared data residing on external
media, a method for controlling access to said
shared data comprising the steps of:

A) placing a shared resource element of said shared
data into a primary data store on the external media;

B) associating with said shared resource element access
control information on said primary data store,
said access control information comprising lock
blocks, each of the lock blocks being uniquely
associated with one of the systems;
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C) in response to a first Read Serialized access re- 
quest by a first process on a first system, granting to 
said first process, by means of an exclusive access 
control method utilizing said access control infor- 
mation on said primary data store, exclusive access 
to the shared resource element in said primary data 
store and reading said shared resource element into 
the main storage; 

D) modifying, by said first process, said shared re- 
source element in the main storage; 

E) in response to a second Read Serialized access
request for the shared resource element by a second
process on a second system, said exclusive
access control method recognizing ownership of
said shared resource element by said first system,
and recording said second Read Serialized access
request in said access control information;

F) determining, by an excessive wait detection 
method that the wait by said second process for the 
shared resource element has been excessive and 
passing exclusive control of the resource to said 
second system by a lock-stealing method, said lock- 
stealing method comprising the step of modifying 
said access control information on said primary 
data store to reflect said lock stealing; 

G) in response to a Write Serialized access request for 
the shared resource element by the first process, to 
write back said source element as modified, said 
exclusive access control method recognizing said 
modifying of said access control information on 
said primary data store, and rejecting said Write 
Serialized access request. 

2. The method of claim 1, further comprising the step 
of constructing a substantially similar copy of the pri- 
mary data store in an alternate data store, the alternate 
data store also having access control information com- 
prising lock blocks. 

3. The method for controlling access of claim 2 in 
which said exclusive access control method comprises 
the steps of: 

A) generating a lock key associated with said first 
system; 

B) writing said lock key to one of the lock blocks 
associated with said shared resource element, and 
associated with said first system, on the primary 
data store; 

C) reading all lock blocks associated with said resource
from the primary data store;

D) using said lock blocks to resolve ownership of said 
resource. 

4. The method of claim 3 in which said step of using 
said lock blocks comprises the steps of: 

A) identifying said first system as owner of said 
shared resource element of the primary data store if 
all lock blocks associated with said resource, ex- 
cept those also associated with said first system, are 
zero; 

B) identifying said first system as not owner of said 
shared resource element on the primary data store 
if at least one lock block associated with said 
shared resource element, but not associated with 
said first system, is nonzero, and each of said non- 
zero lock blocks has a time-of-day value older than 
that of said first system; and having said first system 
wait for notice of ownership of said shared re- 
source element; 

C) if steps (A) and (B) do not apply, identifying an 
indeterminate ownership situation; generating, by 
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said first system, a new time-of-day value; updating 
with said new time-of-day value all lock blocks
associated with said shared resource element and 
with said first system; reading all lock blocks asso- 
ciated with said shared resource element; determin- 
ing which of said read lock system associated with 
the one of said read lock blocks with the oldest 
time-of-day value that said second system is now 
owner of said shared resource element. 

5. The method of claim 4 in which the lock-stealing 
method comprises the steps of: 

A) writing the lock block associated with the primary 
data store, associated with a former resource 
owner to the secondary data store;

B) zeroing the lock block associated with the former 
resource owner in the primary data store; 

C) determining which waiting system has been wait- 
ing the longest for the resource, and signalling said 
waiting system that it has exclusive control of the
shared resource element. 

6. The method of claim 2 further comprising the steps 
of: 

A) determining that the primary data store has a
defective record by comparing a check record 
with a suffix record associated with the defective 
record; 

B) performing local error correction on the primary 
data store using said alternate data store. 

7. The method of claim 2 further comprising the steps 
of: 

A) determining that the primary data store is uncor- 
rectably in error; 

B) substituting the alternate data store for said pri- 
mary data store; 

C) dynamically creating a new alternate data store. 

8. In a multi-system central electronic complex 
(CEC) comprising systems each having main storage, 
system resources, an operating system for managing 
said system resources, and shared data residing on exter- 
nal media, a serialization mechanism for controlling 
access to said shared data comprising: 

A) a primary data store comprising a shared resource
element having shared data;

B) within said primary data store, exclusive access
means for a first system in said complex acquiring
exclusive access to the shared resource element on
the primary data store;

C) a global control element within said primary data 
store; 

D) Read Serialized means for requesting and obtain- 
ing exclusive ownership of said shared resource 
element by a requesting process in said first system,
comprising: 

i) update means for updating said exclusive access 
means with a record of said requesting; 

ii) notify means for notifying said requesting pro- 
cess of success or failure of said requesting; 

E) lock steal means for stealing exclusive ownership
of said shared resource element from said first system,
said lock steal means comprising:

i) detection means for determining that said first
system is failed or temporarily stopped, and

ii) preemption means for a second system preempting
said first system and passing to a waiting
system exclusive access to the resource of said
shared data when said first system is failed or
temporarily stopped, said preemption means
updating said exclusive access means to indicate
said stealing;

F) Write Serialized means for writing back exclu- 
sively owned resources to said primary data store, 
comprising: 

i) verify means for checking said exclusive access 
means to determine whether said stealing has 
been indicated, and 

ii) notify means to notify said requesting process of 
a request rejected when said stealing has been 
indicated. 

9. The multi-system CEC of claim 8 further compris- 
ing an alternate data store substantially similar to the 
primary data store. 

10. The multi-system CEC of claim 9 in which the
exclusive access means comprises a lock block having a
system-related lock key associated with each shared
resource element on the primary data store, said lock
key having a system sequence number field and a
time-of-day value field, said fields associated with a particular
system having zero values on the primary data store
when the particular system does not require the shared
resource element; and having nonzero values on the
primary data store when the particular system acquires
exclusive control of the shared resource element.

11. The multi-system CEC of claim 8 in which the
detection means comprises a timer-driven routine for
comparing the waiting system's wait time against an
"excessive wait" interval.

* * * * *
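
For readers who find the ownership rules easier to follow in code, steps (A) through (C) of claim 4 might be approximated as below. This is a hedged sketch and not claim language: the function name `apply_lock_rules`, the dictionary of TOD values, and the convention that a smaller number means an older time-of-day are all assumptions, and the handling of the indeterminate case (generating a new time-of-day value and re-reading the lock blocks) is left to the caller.

```python
def apply_lock_rules(me, tods):
    """Classify the requesting system `me` against all lock blocks.

    tods maps a system name to its TOD value; 0 stands for a zeroed lock block.
    A smaller TOD is treated as older (an assumption of this sketch).
    """
    others = [tod for system, tod in tods.items() if system != me and tod != 0]
    if not others:
        return "owner"          # step (A): every other lock block is zero
    if all(tod < tods[me] for tod in others):
        return "wait"           # step (B): all other nonzero blocks are older
    return "indeterminate"      # step (C): retry with a new time-of-day value


if __name__ == "__main__":
    print(apply_lock_rules("A", {"A": 1, "B": 0}))   # A alone holds a lock block
    print(apply_lock_rules("B", {"A": 1, "B": 2}))   # B arrives after A
    print(apply_lock_rules("A", {"A": 3, "B": 2}))   # A re-requests after B
```

The second and third calls correspond to the points in Example 2 where System B, and later System A, must wait under lock rule 2.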
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